Population growth is closely associated with economic development and with migrant inflows into a country. However, factors such as health development and demographic structure may sometimes matter even more than economic development, since the health of a nation may itself predetermine economic growth. The Report of the Commission on Macroeconomics and Health (CMH, 2001) states that health plays an important role in population growth and economic development because of the correlation between the level of income and the burden of illness. Higher economic growth, in turn, creates more opportunities to implement programs that improve health and raise population growth. This shows that the two are interconnected and that health development is an issue of high importance, one that influences welfare and economic development through population growth. The aim of this research is therefore to look for demographic and health factors that may influence population growth. As a demographic factor I include the fertility rate, which is one of the factors affecting population growth directly. The study by Joshi and Schultz (2007) shows that population programs are usually voluntary and aim at increasing fertility, which is the demographic side of the population growth issue. Their results also show that when the fertility rate increases, the welfare of families increases on average as well. From that I conclude that fertility and population growth have one of the strongest impacts on the welfare of society, which is the ultimate goal of every nation. However, it is quite difficult to evaluate the effect of an increase in the fertility rate in a country (Moffitt, 2005).
Another factor tested in this research is health, which can affect population growth through different mechanisms (Bloom and Canning, 2008). The first is through higher saving rates: the healthier a person is, the less he or she has to spend on treating illnesses, the more disposable income the person possesses and, as a result, the greater the ability to have children. Another mechanism is through education: children in good health miss less school and develop their cognitive abilities further, which raises human capital in the country and thereby affects economic and population growth. In addition, the Mexican Commission on Macroeconomics and Health (2004) states that health development should be treated as an asset in which every country has to invest, as it minimizes the number of sick people and helps avoid increases in unemployment and mortality rates and, as a result, a decrease in GDP.
All in all, I will search for the determinants of population growth among health and demographic factors.
As the aim of the research is to find the demographic and health factors that may affect population growth, I work with cross-sectional data and analyze the effects for high-income countries. The data are taken from the World Bank Databank for the year 2013. The main regression equation is presented below:
Pop= β0 + β1agedep + β2limmuni + β3sanit + β4sqwage + β5wage + β6tuberculosis + β7fertility + β8survival65 + β9unemp + β10lunemp
The dependent variable, Pop, is the growth rate of the mid-year population for 2013, measured in percent. The population includes all residents of a country regardless of their citizenship, but excludes refugees and asylum seekers.
The first independent variable, agedep, is the age dependency ratio: the number of dependents (people younger than 15 or older than 64) per 100 people of working age (15 to 64). I expect the variable to have a positive effect on population growth, as the ratio is defined relative to the working-age group, whose members are the most likely to have children. It may also have a very strong effect on the dependent variable, as both measures are built from similar population counts.
The next explanatory variable, limmuni, is the log of the share of children adequately immunized against diphtheria, pertussis (whooping cough) and tetanus (DPT). I applied the log transformation to bring the data closer to a normal distribution and to make the regression results easier to interpret. I anticipate a positive sign, as these diseases are extremely dangerous for young children; hence, where vaccination coverage is insufficient, population growth may be lower. However, since I run the regression on data for high-income countries, the variable should not have a strong effect, as the level of income often goes along with the development of the medical system in a country.
The percentage of the population using improved sanitation facilities (flush or pour-flush toilets, composting toilets, etc.), sanit, is the next independent variable. I expect its impact to be lower than in low-income countries, following a logic similar to that for limmuni: high-income countries are perceived a priori as having good sanitation facilities. That is why I expect the sign of the coefficient to be positive but the degree of impact to be low.
The fourth and fifth explanatory variables, sqwage and wage, are the squared term and the level of wage and salaried workers, expressed as a percentage of total employment. I assume that higher wages give people more security and an incentive to have more children. I include the squared term because I do not expect increases in wages to have the same effect on population growth at every level. That is why I expect wage to have a positive sign and sqwage a negative one.
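A minimal sketch of how a diminishing wage effect could be specified with a literal squared term, assuming simulated data: the data frame `toy` and its true coefficients are hypothetical and are not taken from the World Bank extract. Note that the estimation code in this report builds sqwage with `sqrt()`, whereas the sketch uses `I(wage^2)` as the text describes. The turning point of the fitted parabola is -b1 / (2 * b2).

```r
# Simulated illustration only: 'toy' and the true coefficients are hypothetical
set.seed(1)
toy <- data.frame(wage = runif(200, min = 60, max = 100))
toy$pop <- -20 + 0.5 * toy$wage - 0.003 * toy$wage^2 + rnorm(200, sd = 0.1)

# I() protects the arithmetic so that wage^2 enters the model as a regressor
fit_quad <- lm(pop ~ wage + I(wage^2), data = toy)
b <- coef(fit_quad)

# Value of wage at which the fitted marginal effect changes sign
turning_point <- unname(-b["wage"] / (2 * b["I(wage^2)"]))
print(round(b, 4))
print(turning_point)
```

A positive coefficient on `wage` together with a negative one on `I(wage^2)` corresponds to the sign pattern expected in the text.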
I also include the variable tuberculosis, the tuberculosis treatment success rate as a percentage of all new tuberculosis cases. A case counts as a success when the patient has completed treatment in a tuberculosis control program. The variable is expected to have a positive effect on population growth, as successful treatment contains the disease and lowers mortality.
The seventh explanatory variable, fertility, is the fertility rate, i.e. the number of children a woman would give birth to over her childbearing years. As it is one of the major drivers of population growth, its impact is predicted to be strong and positive.
I also consider survival65, the share of a cohort of newborn infants that would survive to age 65 if subject to current age-specific mortality rates. The variable is expected to have a positive influence on population growth, as higher survival rates give a longer time span in which to have children. Nevertheless, in high-income countries the effect probably will not be as significant as in low-income ones.
Additionally, I include two variables, unemp and lunemp, which deal with female unemployment (as a percentage of the female labour force) and long-term unemployment of both genders (as a percentage of total unemployment in a country). For female unemployment I expect the effect to be positive: a woman who is unemployed for some time but is, for example, married may consider having children. For long-term unemployment, by contrast, the effect is expected to be negative: unemployment means low or even absent wages, and since children require a lot of investment, it may simply not be feasible for people to have more children while unemployed.
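# Set the working directory to the project folder (path specific to the author's machine)
setwd("/Users/skakunyuliya/IODS-final/IODS-final")
# Read the World Bank extract; the first CSV column is used as row names
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
# Copy each column into a standalone vector for use in later chunks
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp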
dim(data)
## [1] 79 10
str(data)
## 'data.frame': 79 obs. of 10 variables:
## $ country : Factor w/ 79 levels "Andorra","Antigua and Barbuda",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ agedep : num NA 47.2 56.7 44.4 49.6 ...
## $ fertility : num NA 2.09 2.33 1.67 1.92 ...
## $ imuni : int 95 98 94 NA 94 76 92 99 87 96 ...
## $ sanit : num 100 NA 95.8 97.7 100 100 92 99.2 95.5 99.5 ...
## $ survival65 : num NA 7.12 10.7 11.33 14.38 ...
## $ pop : num -4.4 1.023 1.047 0.514 1.734 ...
## $ tuberculosis: int 60 67 51 NA 85 72 76 NA 100 79 ...
## $ wage : num NA NA 75.3 NA 89.6 ...
## $ unemp : num NA NA 8.6 NA 5.6 ...
The data were downloaded from the World Bank. They are cross-sectional data for 2013. No data wrangling has been done.
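Since the raw extract is used as is, it helps to know how many values are missing before modelling. A minimal sketch, assuming a small made-up data frame `toy` (its values are hypothetical and only mimic the kind of gaps seen in the `str(data)` output above):

```r
# Hypothetical toy data with the same kind of gaps as the World Bank extract
toy <- data.frame(
  country   = c("A", "B", "C", "D"),
  fertility = c(1.8, NA, 2.1, 1.5),
  wage      = c(85.0, 90.2, NA, 88.1),
  unemp     = c(7.4, 5.6, NA, 9.9)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Number of rows with no missing value in any column
n_complete <- sum(complete.cases(toy))
print(n_complete)
```

Applied to the real data frame, the same two calls would show which variables limit the usable sample.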
The data contain 79 observations (countries), which are listed below:
| Countries | ||
|---|---|---|
| Andorra | Croatia | Hungary |
| Antigua and Barbuda | Curacao | Iceland |
| Argentina | Cyprus | Ireland |
| Aruba | Czech Republic | Isle of Man |
| Australia | Denmark | Israel |
| Austria | Equatorial Guinea | Italy |
| Bahamas, The | Estonia | Japan |
| Bahrain | Faeroe Islands | Korea, Rep. |
| Barbados | | |
| Belgium | France | Latvia |
| Bermuda | French Polynesia | Liechtenstein |
| Brunei Darussalam | Germany | Lithuania |
| Canada | Greece | Luxembourg |
| Cayman Islands | Greenland | Macao SAR, China |
| Channel Islands | Guam | Malta |
| Chile | Hong Kong SAR, China | Monaco |
| Netherlands | New Caledonia | Portugal |
| Saudi Arabia | New Zealand | Puerto Rico |
| Seychelles | Northern Mariana Islands | Qatar |
| Singapore | Norway | Russian Federation |
| Sint Maarten (Dutch part) | Oman | San Marino |
| Slovak Republic | Poland | United Arab Emirates |
| Slovenia | St. Martin (French part) | United Kingdom |
| Spain | Sweden | United States |
| St. Kitts and Nevis | Switzerland | Uruguay |
| Venezuela, RB | Trinidad and Tobago | Virgin Islands (U.S.) |
| Turks and Caicos Islands | ||
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
summary(data)
## country agedep fertility imuni
## Andorra : 1 Min. :17.18 Min. :1.124 Min. :42.00
## Antigua and Barbuda: 1 1st Qu.:42.40 1st Qu.:1.454 1st Qu.:92.00
## Argentina : 1 Median :48.99 Median :1.772 Median :95.00
## Aruba : 1 Mean :47.10 Mean :1.823 Mean :93.36
## Australia : 1 3rd Qu.:52.34 3rd Qu.:2.016 3rd Qu.:98.00
## Austria : 1 Max. :73.80 Max. :4.924 Max. :99.00
## (Other) :73 NA's :14 NA's :11 NA's :20
## sanit survival65 pop tuberculosis
## Min. : 72.20 Min. : 0.9419 Min. :-4.3997 Min. : 0.00
## 1st Qu.: 96.60 1st Qu.: 9.8954 1st Qu.: 0.2605 1st Qu.: 62.00
## Median : 98.80 Median :14.0171 Median : 0.7566 Median : 77.00
## Mean : 96.76 Mean :13.1440 Mean : 0.8975 Mean : 72.92
## 3rd Qu.: 99.90 3rd Qu.:17.7501 3rd Qu.: 1.2076 3rd Qu.: 84.00
## Max. :100.00 Max. :25.0093 Max. : 9.7155 Max. :100.00
## NA's :16 NA's :14 NA's :18
## wage unemp
## Min. :59.20 Min. : 1.600
## 1st Qu.:82.33 1st Qu.: 5.175
## Median :86.35 Median : 7.400
## Mean :84.95 Mean : 8.984
## 3rd Qu.:89.67 3rd Qu.:11.125
## Max. :99.30 Max. :31.300
## NA's :31 NA's :23
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
library(GGally)
library(ggplot2)
pic <- ggpairs(data, mapping = aes( alpha = 0.3, col="orange"), cardinality_threshold=79, lower = list(combo = wrap("facethist", bins = 20)))
pic
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
hist(data$fertility, main="Distribution of fertility rate across countries", xlab="fertility", col = "turquoise2")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
qplot(data$unemp, geom = "histogram", binwidth = 1, main = "Histogram for unemployment", xlab = "unemployment", col = I("black"), fill = I("pink"))
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
boxplot(data, main="Distribution of data across countries", col = "yellow2")
library(dplyr)
library(tidyr)
library(ggplot2)
join= c("imuni","unemp","wage", "tuberculosis","fertility")
data2 <-select(data, one_of(join))
graph = ggpairs(data2, upper = list(continuous = wrap("density", color = "blue")))+ ggtitle("Distribution of data across countries")
graph
Now I will build the correlation matrix of the independent variables.
library(corrplot)
library(dplyr)
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
# Variables to include in the correlation matrix
join <- c("imuni","unemp","wage", "tuberculosis","fertility")
data2 <- select(data, one_of(join))
# Keep only countries with no missing value in any selected variable
data3 <- data2[complete.cases(data2), ]
cor_matrix <- cor(data3) %>% round(digits=2)
# Upper-triangle plot with the coefficient printed in each cell
corrplot(cor_matrix, method = "circle", tl.cex = 1, addCoef.col = "black", type="upper")
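The chunk above drops every country that has a missing value in any of the selected variables before computing correlations. One alternative, shown here as a sketch on made-up data (the data frame `toy` is hypothetical), is the `use = "pairwise.complete.obs"` option of `cor()`, which keeps, for each pair of variables, every row where both values are present.

```r
# Hypothetical toy data; values are made up for illustration
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2, 4, 6, 8, NA, 12),
  z = c(6, 5, NA, 3, 2, 1)
)

# Listwise deletion: only rows complete in all three columns are used
cor_listwise <- cor(toy[complete.cases(toy), ])

# Pairwise deletion: each pair uses every row where both values are present
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

print(round(cor_listwise, 2))
print(round(cor_pairwise, 2))
```

Pairwise deletion uses more of the data, but each coefficient is then based on a different subset of countries, so the choice is a trade-off rather than a strict improvement.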
library(dplyr)
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
join= c("imuni","unemp","wage", "tuberculosis","fertility")
data2 <-select(data, one_of(join))
graph1 = ggpairs(data2, lower = list(continuous = wrap("smooth", method = "lm", color="red")))
graph1
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
qplot(agedep, country, data = data) + xlab("Age dependency") + ylab("Country") + ggtitle("Age Dependency vs. Country") + geom_smooth(method = "lm")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
qplot(fertility, country, data = data) + xlab("Fertility") + ylab("Country") + ggtitle("Fertility vs. Country") + geom_smooth(method = "lm", col="yellow")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
qplot(tuberculosis, sanit, data = data) + xlab("Tuberculosis") + ylab("Sanit") + ggtitle("Tuberculosis vs. Sanit") + geom_smooth(method = "lm", col="pink")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
qplot(imuni, sanit, data = data) + xlab("Immunization") + ylab("Sanit") + ggtitle("Immunization vs. Sanit") + geom_smooth(method = "lm")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
qplot(pop, fertility, data = data) + xlab("Population growth") + ylab("Fertility") + ggtitle("Population growth vs. Fertility") + geom_smooth(method = "lm", col="red")
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
join= c("imuni","unemp","wage", "tuberculosis","fertility")
data2 <-select(data, one_of(join))
gather(data2) %>% ggplot(aes(value, col="pink")) + facet_wrap("key", scales = "free") + geom_histogram()
Now I will perform an ordinary least squares (OLS) regression in order to analyse the factors affecting population growth. OLS chooses the coefficients that minimize the sum of squared differences between the observed values of the dependent variable and the values fitted from the independent variables. Variables that show statistically significant explanatory power will be taken into account when drawing the conclusions of my analysis.
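To make the idea concrete, here is a minimal sketch on simulated data (the variables `x1`, `x2`, `y` and their true coefficients are hypothetical). It computes the OLS coefficients by solving the normal equations (X'X) b = X'y and compares them with the result of `lm()`.

```r
# Simulated illustration: 'x1', 'x2' and the true coefficients are hypothetical
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix with a column of ones for the intercept
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# OLS estimate: solve the normal equations (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# lm() minimizes the same sum of squared residuals
fit <- lm(y ~ x1 + x2)

print(cbind(normal_equations = drop(beta_hat), lm = unname(coef(fit))))
```

The two columns agree up to numerical precision; `lm()` uses a QR decomposition internally, which is more stable than inverting X'X directly.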
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
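# Log of the DPT immunization rate
limmuni <- log(imuni)
# Note: lunemp is the log of female unemployment (unemp), not a separate long-term unemployment series
lunemp <- log(unemp)
# Note: sqwage is the square root of wage, not its square
sqwage <- sqrt(wage)
# Append the transformed variables to the data frame
data1 <- cbind(data,limmuni,lunemp,sqwage)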
model <- lm(pop ~ agedep+limmuni+sanit+sqwage+wage+tuberculosis+fertility+survival65+unemp+lunemp, data = data1)
summary(model)
##
## Call:
## lm(formula = pop ~ agedep + limmuni + sanit + sqwage + wage +
## tuberculosis + fertility + survival65 + unemp + lunemp, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.69749 -0.21539 0.00483 0.17197 0.63240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.684384 20.823519 1.137 0.26502
## agedep -0.031070 0.031758 -0.978 0.33629
## limmuni 0.005666 1.266753 0.004 0.99646
## sanit 0.066475 0.013253 5.016 2.66e-05 ***
## sqwage -6.429617 4.696370 -1.369 0.18186
## wage 0.381801 0.263302 1.450 0.15816
## tuberculosis -0.011944 0.003628 -3.292 0.00269 **
## fertility 0.969611 0.570459 1.700 0.10027
## survival65 -0.064498 0.056212 -1.147 0.26093
## unemp 0.009822 0.039603 0.248 0.80595
## lunemp -0.595506 0.382179 -1.558 0.13042
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3881 on 28 degrees of freedom
## (40 observations deleted due to missingness)
## Multiple R-squared: 0.886, Adjusted R-squared: 0.8453
## F-statistic: 21.76 on 10 and 28 DF, p-value: 1.225e-10
The first regression is based on cross-sectional data for 79 high-income countries (as classified by the World Bank, from whose databank I retrieved the data).
The R-squared of the regression is 0.886, which suggests that the model explains about 89% of the variation in population growth in the sample. The p-value of the F-statistic is close to zero (1.225e-10), which implies that the regressors are jointly significant. Nevertheless, only sanit and tuberculosis are individually significant at the 5% level; the remaining 8 of the 10 variables, including limmuni, sqwage, wage and lunemp, are not.
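The adjusted R-squared and the F-statistic in the output above can be reproduced from the reported R-squared, the number of observations used and the number of regressors. A short worked sketch, using only numbers read from the printed summary (the variable names `r2`, `n`, `k` are my own):

```r
# Values read from the summary() output of the first model
r2 <- 0.886  # Multiple R-squared
n  <- 39     # observations used (79 minus 40 dropped for missingness)
k  <- 10     # regressors, excluding the intercept

# Adjusted R-squared penalizes R-squared for the number of regressors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic for the joint significance of all regressors
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

print(round(adj_r2, 4))
print(round(f_stat, 2))
```

Because the inputs are rounded, the results match the printed 0.8453 and 21.76 only up to rounding.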
One of the reasons for these problems is the lack of data. Not all high-income countries report all of the chosen variables for this specific year; hence, instead of the intended 79 countries, only 39 enter the estimation.
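By default `lm()` drops any row with a missing value in a model variable, and the effective sample size can be checked directly on the fitted object. A minimal sketch on made-up data (the data frame `toy` is hypothetical):

```r
# Hypothetical toy data with gaps; values are made up for illustration
toy <- data.frame(
  y  = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1),
  x1 = c(1, 2, 3, 4, 5, 6, 7, 8),
  x2 = c(0.5, NA, 0.1, 0.9, NA, 0.3, 0.7, NA)
)

fit <- lm(y ~ x1 + x2, data = toy)

# Rows actually used in the fit, and rows dropped because of missing values
n_used    <- nobs(fit)
n_dropped <- length(fit$na.action)

print(c(total = nrow(toy), used = n_used, dropped = n_dropped))
```

The same two calls on the fitted model in this report would return the 39 used and 40 dropped observations shown in its summary.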
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
limmuni <- log(imuni)
lunemp <- log(unemp)
sqwage <- sqrt(wage)
data1=cbind(data,limmuni,lunemp,sqwage)
model <- lm(pop ~ agedep+limmuni+sanit+sqwage+wage+tuberculosis+fertility+survival65+unemp+lunemp, data = data1)
plot(model, which = c(1, 2, 5), col = "orange", lwd = 3)
Taking these facts into account, I have tried to improve the model by removing the wage, tuberculosis and lunemp variables from the equation. The rationale is that wage and lunemp are highly correlated with other regressors already in the model (sqwage and unemp), whereas tuberculosis has many missing observations. With only the sqwage term kept, the model tracks only whether the effect of wage is diminishing or not.
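One common way to quantify such correlation among regressors is the variance inflation factor (VIF). The sketch below computes it by hand on simulated data; the data frame `toy` and the helper `vif_manual` are hypothetical names introduced for illustration, and this is not necessarily the check used for the decision above.

```r
# Simulated illustration: 'wage' and 'sanit' here are hypothetical series
set.seed(7)
toy <- data.frame(wage  = runif(100, min = 60, max = 100),
                  sanit = runif(100, min = 70, max = 100))
toy$sqwage <- sqrt(toy$wage)

# VIF of one regressor: 1 / (1 - R^2), where R^2 comes from regressing
# that regressor on all the other regressors
vif_manual <- function(data, target) {
  others <- setdiff(names(data), target)
  aux <- lm(reformulate(others, response = target), data = data)
  1 / (1 - summary(aux)$r.squared)
}

vifs <- sapply(names(toy), function(v) vif_manual(toy, v))
print(round(vifs, 1))
```

A variable and its own transformation produce very large VIFs, while an unrelated regressor stays close to 1; values above 10 are often treated as a warning sign, though that threshold is only a rule of thumb.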
The new model we are going to test for high-income countries is
Pop= β0 + β1agedep + β2limmuni + β3sanit + β4sqwage + β5fertility + β6survival65 + β7unemp
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
data1=cbind(data,limmuni,lunemp,sqwage)
pairs(data1)
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
limmuni <- log(imuni)
lunemp <- log(unemp)
sqwage <- sqrt(wage)
data1=cbind(data,limmuni,lunemp,sqwage)
model2 <- lm(pop ~ agedep+limmuni+sanit+sqwage+fertility+survival65+unemp, data = data1)
summary(model2)
##
## Call:
## lm(formula = pop ~ agedep + limmuni + sanit + sqwage + fertility +
## survival65 + unemp, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.00891 -0.33309 -0.07132 0.26139 1.45459
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.193151 8.146285 -0.637 0.52784
## agedep -0.073701 0.039166 -1.882 0.06797 .
## limmuni -0.106708 1.666032 -0.064 0.94929
## sanit 0.073617 0.017180 4.285 0.00013 ***
## sqwage 0.049207 0.271526 0.181 0.85721
## fertility 1.562547 0.711519 2.196 0.03461 *
## survival65 -0.003698 0.068765 -0.054 0.95741
## unemp -0.034609 0.014951 -2.315 0.02644 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5365 on 36 degrees of freedom
## (35 observations deleted due to missingness)
## Multiple R-squared: 0.7367, Adjusted R-squared: 0.6855
## F-statistic: 14.39 on 7 and 36 DF, p-value: 9.41e-09
The results of the revised regression suggest that the model explains 73.7% of the variation in the data (adjusted R-squared 68.6%). Even though this is lower than in the first model, the result seems more reliable. The p-value of the F-statistic (9.41e-09) again confirms that the regressors are jointly significant. Another noticeable trait is the increased number of observations: 44 instead of 39.
The agedep variable is significant only at the 10% level (p = 0.068). The estimate is inconsistent with my initial expectations, as the impact on population growth is negative: a one-unit increase in the age dependency ratio is associated with a population growth rate that is about 0.07 percentage points lower.
The log of immunization, limmuni, turned out to be insignificant (p = 0.95), with a negative sign. The sign does not match my expectation, but the lack of significance does: higher rates of immunization help prevent deaths from certain diseases, yet, as mentioned, in high-income countries the scope of the impact is relatively low.
The variable sanit is significant at the 0.1% level. It follows my expectations in being important for the model and having a positive impact on population growth. The coefficient implies that a one-percentage-point increase in the share of the population using improved sanitation facilities is associated with an annual population growth rate that is about 0.07 percentage points higher.
The sqwage variable is not significant (p = 0.86). I conclude that the estimated effect of the wage term on population growth is positive but minor and statistically indistinguishable from zero.
The fertility variable is significant at the 5% level (p = 0.035) and has a positive impact on population growth. Such a result is self-explanatory, as the fertility rate is one of the basic determinants of demographic change. The estimate suggests that one additional child per woman is associated with a population growth rate that is about 1.56 percentage points higher. Such a result seems exaggerated and may be inflated by the problems the model carries.
The unemp variable is significant at the 5% level and has a negative impact on the dependent variable: a one-percentage-point increase in female unemployment is associated with a population growth rate that is about 0.03 percentage points lower. This outcome contradicts my initial prediction of a positive effect for female unemployment; it matches instead the argument that unemployment leaves households without the means to raise more children.
The last variable in the regression, survival65 (survival to age 65), is not significant (p = 0.96). The negative sign does not match my expectation, but the estimate (-0.004) is too imprecise to support any conclusion. If the negative sign were confirmed, one possible explanation would be that people have fewer children because they worry less about what will happen to them after retirement, perhaps because governments provide them with social support. In that case, people would regard children as a security for the future to a lesser extent.
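As a rough check on how to read these estimates, the sketch below rebuilds 95% confidence intervals for four coefficients from the estimates and standard errors printed in the `summary(model2)` output, and scales one coefficient to a 10-unit change. The numbers are copied from that output; the object names are my own.

```r
# Estimates and standard errors copied from the summary(model2) output
est <- c(agedep = -0.073701, sanit = 0.073617,
         fertility = 1.562547, unemp = -0.034609)
se  <- c(agedep = 0.039166, sanit = 0.017180,
         fertility = 0.711519, unemp = 0.014951)
df_resid <- 36  # residual degrees of freedom reported in the output

# Two-sided 95% confidence intervals: estimate +/- t quantile * std. error
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci, 3))

# Change in growth (percentage points) associated with a 10-unit rise in agedep
effect_agedep_10 <- 10 * est[["agedep"]]
print(effect_agedep_10)
```

An interval that contains zero corresponds to a coefficient that is not significant at the 5% level; on the fitted model itself, `confint(model2)` should give the same intervals for all coefficients.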
data <- read.csv(file='MOOC.csv', header=TRUE, row.names=1, sep=",")
agedep <- data$agedep
fertility <- data$fertility
imuni <- data$imuni
sanit <- data$sanit
survival65 <- data$survival65
pop <- data$pop
tuberculosis <- data$tuberculosis
wage <- data$wage
unemp <- data$unemp
limmuni <- log(imuni)
lunemp <- log(unemp)
sqwage <- sqrt(wage)
data1=cbind(data,limmuni,lunemp,sqwage)
model2 <- lm(pop ~ agedep+limmuni+sanit+sqwage+fertility+survival65+unemp, data = data1)
plot(model2, which = c(1, 2, 5), col = "blue", lwd = 3)
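The diagnostic plots can be complemented with numerical checks. A minimal sketch on simulated data (the data frame `toy` and the model are hypothetical, and the 4 / n cutoff for Cook's distance is one common rule of thumb adopted here as an assumption):

```r
# Simulated illustration: the data and the model are hypothetical
set.seed(3)
toy <- data.frame(x = rnorm(50))
toy$y <- 1 + 0.8 * toy$x + rnorm(50)
fit <- lm(y ~ x, data = toy)

# Normality of residuals (numerical counterpart of the Q-Q plot)
sw <- shapiro.test(residuals(fit))

# Influence of single observations (counterpart of residuals vs. leverage)
cooks <- cooks.distance(fit)
# 4 / n is a rule-of-thumb cutoff, used here as an assumption
influential <- which(cooks > 4 / nobs(fit))

print(sw$p.value)
print(influential)
```

Running the same calls on `model2` would flag the countries that drive the fit, which matters with only 44 observations.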
1. The Mexican Commission on Macroeconomics and Health (2004) ‘Investing in Health for Economic Development’. The Mexican Commission on Macroeconomics and Health
2. Bloom D.E. and Canning D. (2008) ‘Population Health and Economic Growth’. The International Bank for Reconstruction and Development / The World Bank
3. Sachs J.D. (2001) ‘Report of the Commission on Macroeconomics and Health’. World Health Organization
4. Moffitt R. (2005) ‘Remarks on the analysis of causal relationships in population research’. Demography, 42(1), 91-108
5. Joshi S. and Schultz T.P. (2007) ‘Family Planning as an investment in development: evaluation of a program’s consequences in Matlab, Bangladesh’. Economic Growth Center Discussion Paper 951, Yale University, New Haven CT.